
    Computational role of eccentricity dependent cortical magnification

    We develop a sampling extension of M-theory focused on invariance to scale and translation. Quite surprisingly, the theory predicts an architecture of early vision with increasing receptive field sizes and a high-resolution fovea -- in agreement with data about the cortical magnification factor, V1 and the retina. From the slope of the inverse of the magnification factor, M-theory predicts a cortical "fovea" in V1 on the order of 40 by 40 basic units at each receptive field size -- corresponding to a foveola of size around 26 minutes of arc at the highest resolution and approximately 6 degrees at the lowest resolution. It also predicts uniform scale invariance over a fixed range of scales independently of eccentricity, while translation invariance should depend linearly on spatial frequency. Bouma's law of crowding follows in the theory as an effect of cortical area-by-cortical area pooling; the Bouma constant is the value expected if the signature responsible for recognition in the crowding experiments originates in V2. From a broader perspective, the emerging picture suggests that visual recognition under natural conditions takes place by composing information from a set of fixations, with each fixation providing recognition from a space-scale image fragment -- that is, an image patch represented at a set of increasing sizes and decreasing resolutions.
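
    The slope argument can be made concrete with the standard linear approximation to the inverse cortical magnification factor; the LaTeX sketch below uses the common parameterization with a foveal value M_0 and a "doubling" eccentricity E_2 (our notation, not necessarily the paper's):

        % Inverse cortical magnification grows roughly linearly with
        % eccentricity E (M_0 and E_2 are illustrative symbols):
        M^{-1}(E) \approx M_0^{-1} \left( 1 + \frac{E}{E_2} \right)
        % Receptive field spacing scales with M^{-1}(E), so a fixed number
        % of basic units (about 40 x 40 in the prediction above) tiles the
        % cortical "fovea" at each receptive field size.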

    How can cells in the anterior medial face patch be viewpoint invariant?

    In a recent paper, Freiwald and Tsao (2010) found evidence that the responses of cells in the macaque anterior medial (AM) face patch are invariant to significant changes in viewpoint. The monkey subjects had no prior experience with the individuals depicted in the stimuli and were never given an opportunity to view the same individual from different viewpoints sequentially. These results cannot be explained by a mechanism based on temporal association of experienced views. Employing a biologically plausible model of object recognition (software available at cbcl.mit.edu), we show two mechanisms that could account for these results. First, we show that hair style and skin color provide sufficient information to enable viewpoint-invariant recognition without resorting to any mechanism that associates images across views. It is likely that a large part of the effect described in patch AM is attributable to these cues. Separately, we show that it is possible to further improve view-invariance using class-specific features (see Vetter 1997). Faces, as a class, transform under 3D rotation in similar enough ways that it is possible to use previously viewed example faces to learn a general model of how all faces rotate. Novel faces can be encoded relative to these previously encountered “template” faces and thus recognized with some degree of invariance to 3D rotation. Since each object class transforms differently under 3D rotation, it follows that invariant recognition from a single view requires a recognition architecture with a detection step determining the class of an object (e.g. face or non-face) prior to a subsequent identification stage utilizing the appropriate class-specific features.
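
    A minimal numerical sketch of the class-specific template idea (function and variable names are ours; max pooling over views is one illustrative choice): a novel face is encoded by its similarity to each stored template face, pooled over that template's previously seen viewpoints, yielding a code that changes little under 3D rotation for objects of the same class.

        import numpy as np

        def template_signature(x, template_views):
            # template_views: one array per stored "template" face, each of
            # shape (n_views, d) -- that face observed from many viewpoints.
            # Pooling each template's responses over its views gives a code
            # for x that is approximately invariant to 3D rotation.
            return np.array([np.max(views @ x) for views in template_views])

        rng = np.random.default_rng(0)
        templates = [rng.standard_normal((5, 100)) for _ in range(3)]
        novel_face = rng.standard_normal(100)
        print(template_signature(novel_face, templates))  # 3-D signature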

    CNS: a GPU-based framework for simulating cortically-organized networks

    Computational models whose organization is inspired by the cortex are increasing in both number and popularity. Current instances of such models include convolutional networks, HMAX, Hierarchical Temporal Memory, and deep belief networks. These models present two practical challenges. First, they are computationally intensive. Second, while the operations performed by individual cells, or units, are typically simple, the code needed to keep track of network connectivity can quickly become complicated, leading to programs that are difficult to write and to modify. Massively parallel commodity computing hardware has recently become available in the form of general-purpose GPUs. This helps address the first problem but exacerbates the second. GPU programming adds an extra layer of difficulty, further discouraging exploration. To address these concerns, we have created a programming framework called CNS ('Cortical Network Simulator'). CNS models are automatically compiled and run on a GPU, typically 80-100x faster than on a single CPU, without the user having to learn any GPU programming. A novel scheme for the parametric specification of network connectivity allows the user to focus on writing just the code executed by a single cell. We hope that the ability to rapidly define and run cortically-inspired models will facilitate research in the cortical modeling community. CNS is available under the GNU General Public License.
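
    The core usability idea (the user writes only the single-cell computation; the framework owns connectivity and applies it across a layer) can be illustrated with a plain NumPy analogue. This is our sketch of the pattern, not CNS's actual API:

        import numpy as np

        def simple_cell(inputs, weights):
            # the only code a user would write: one cell's tuning function
            return np.tanh(inputs @ weights)

        def run_layer(prev_layer, weights, kernel):
            # "framework" role: apply the same single-cell kernel everywhere
            return np.array([kernel(prev_layer, w) for w in weights])

        rng = np.random.default_rng(0)
        layer0 = rng.standard_normal(64)            # previous-layer activity
        W = rng.standard_normal((32, 64))           # one weight vector per cell
        layer1 = run_layer(layer0, W, simple_cell)  # 32 new cell responses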

    Neurons That Confuse Mirror-Symmetric Object Views

    Neurons in inferotemporal cortex that respond similarly to many pairs of mirror-symmetric images -- for example, 45 degree and -45 degree views of the same face -- have often been reported. The phenomenon seemed to be an interesting oddity. However, the same phenomenon has also emerged in simple hierarchical models of the ventral stream. Here we state a theorem characterizing sufficient conditions for this curious invariance to occur in a rather large class of hierarchical networks and demonstrate it with simulations.
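
    A toy demonstration of one sufficient condition of this kind (our illustration, not the paper's theorem statement): if a unit's template is either symmetric or antisymmetric under mirror reflection, an even response such as an absolute value is identical for an image and its mirror image.

        import numpy as np

        rng = np.random.default_rng(0)
        x = rng.standard_normal(8)
        R = np.fliplr(np.eye(8))   # 1-D mirror reflection (exchange matrix)
        w = rng.standard_normal(8)
        w_sym = w + R @ w          # satisfies R @ w_sym == +w_sym
        w_anti = w - R @ w         # satisfies R @ w_anti == -w_anti
        for v in (w_sym, w_anti):
            # |<Rx, v>| == |<x, v>| since R is symmetric and R v = +/- v
            assert np.isclose(abs((R @ x) @ v), abs(x @ v))
        print("responses confuse mirror-symmetric views")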

    From primal templates to invariant recognition

    We can immediately recognize novel objects seen only once before -- in different positions on the retina and at different scales (distances). Is this ability hardwired by our genes or learned during development -- and if so, how? We present a computational proof that developmental learning of invariance in recognition is possible and can emerge rapidly. This computational work sets the stage for experiments on the development of object invariance while suggesting a specific mechanism that may be critically tested.

    The computational magic of the ventral stream: sketch of a theory (and why some deep architectures work).

    This paper explores the theoretical consequences of a simple assumption: the computational goal of the feedforward path in the ventral stream -- from V1 through V2 and V4 to IT -- is to discount image transformations, after learning them during development.

    Does invariant recognition predict tuning of neurons in sensory cortex?

    Tuning properties of simple cells in cortical V1 can be described in terms of a "universal shape" characterized by parameter values which hold across different species. This puzzling set of findings begs for a general explanation grounded in an evolutionarily important computational function of the visual cortex. We ask here whether these properties are predicted by the hypothesis that the goal of the ventral stream is to compute for each image a "signature" vector which is invariant to geometric transformations, with the additional assumption that the mechanism for continuously learning and maintaining invariance consists of the memory storage of a sequence of neural images of a few objects undergoing transformations (such as translation, scale changes and rotation) via Hebbian synapses. For V1 simple cells the simplest version of this hypothesis is the online Oja rule, which implies that the tuning of neurons converges to the eigenvectors of the covariance of their input. Starting with a set of dendritic fields spanning a range of sizes, simulations supported by a direct mathematical analysis show that the solution of the associated "cortical equation" provides a set of Gabor-like wavelets with parameter values that are in broad agreement with the physiology data. We show, however, that the simple version of the Hebbian assumption does not predict all the physiological properties. The same theoretical framework also provides predictions about the tuning of cells in V4 and in the face patch AL which are in qualitative agreement with physiology data.
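
    The Oja-rule step mentioned above is easy to verify numerically; a minimal sketch with toy Gaussian inputs (learning rate and sample count are our illustrative choices):

        import numpy as np

        # Online Oja rule: dw = eta * y * (x - y * w), with y = w . x.
        # The weight vector converges to the top eigenvector of the input
        # covariance, here diag(3, 1, 0.5), i.e. +/- [1, 0, 0].
        rng = np.random.default_rng(0)
        C = np.diag([3.0, 1.0, 0.5])
        X = rng.multivariate_normal(np.zeros(3), C, size=20000)
        w = rng.standard_normal(3)
        eta = 0.005
        for x in X:
            y = w @ x
            w += eta * y * (x - y * w)
        print(w / np.linalg.norm(w))  # approaches the principal eigenvector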

    A hierarchical model of peripheral vision

    We present a peripheral vision model inspired by the cortical architecture discovered by Hubel and Wiesel. As with existing cortical models, this model contains alternating layers of simple cells, which employ tuning functions to increase specificity, and complex cells, which pool over simple cells to increase invariance. To extend the traditional cortical model, we introduce the option of eccentricity-dependent pooling and tuning parameters within a given model layer. This peripheral vision system can be used to model physiological data where receptive field sizes change as a function of eccentricity. This gives the user flexibility to test different theories about filtering and pooling ranges in the periphery. In a specific instantiation of the model, pooling and tuning parameters can increase linearly with eccentricity to model physiological data found in different layers of the visual cortex. Additionally, it can be used to introduce pre-cortical model layers such as the retina and LGN. We have tested the model's response with different parameters on several natural images to demonstrate its effectiveness as a research tool. The peripheral vision model presents a useful tool to test theories about crowding, attention, visual search, and other phenomena of peripheral vision. This work was supported by the following grants: NSF-0640097, NSF-0827427, NSF-0645960, DARPA-DSO, AFOSR FA8650-50-C-7262, AFOSR FA9550-09-1-0606.
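
    A schematic of the eccentricity-dependent pooling option (function names, parameters, and values are ours, chosen only for illustration): the pooling radius grows linearly with eccentricity, so units near the fovea integrate over small regions and peripheral units over large ones.

        import numpy as np

        def pooling_radius(eccentricity_deg, r0=0.5, slope=0.2):
            # r0: foveal pooling radius in degrees;
            # slope: growth per degree of eccentricity (illustrative values)
            return r0 + slope * eccentricity_deg

        def pool(responses, positions, center, ecc_of_center):
            # average simple-cell responses within the eccentricity-scaled
            # radius around the pooling unit's center
            r = pooling_radius(ecc_of_center)
            mask = np.linalg.norm(positions - center, axis=1) <= r
            return responses[mask].mean()

        # toy grid of simple-cell positions (degrees of visual angle)
        positions = np.stack(np.meshgrid(np.linspace(-10, 10, 41),
                                         np.linspace(-10, 10, 41)), -1).reshape(-1, 2)
        responses = np.random.default_rng(0).random(len(positions))
        print(pool(responses, positions, center=np.array([5.0, 0.0]),
                   ecc_of_center=5.0))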

    Unsupervised learning of invariant representations

    The present phase of Machine Learning is characterized by supervised learning algorithms relying on large sets of labeled examples (n → ∞). The next phase is likely to focus on algorithms capable of learning from very few labeled examples (n → 1), like humans seem able to do. We propose an approach to this problem and describe the underlying theory, based on the unsupervised, automatic learning of a "good" representation for supervised learning, characterized by small sample complexity. We consider the case of visual object recognition, though the theory also applies to other domains like speech. The starting point is the conjecture, proved in specific cases, that image representations which are invariant to translation, scaling and other transformations can considerably reduce the sample complexity of learning. We prove that an invariant and selective signature can be computed for each image or image patch: the invariance can be exact in the case of group transformations and approximate under non-group transformations. A module performing filtering and pooling, like the simple and complex cells described by Hubel and Wiesel, can compute such a signature. The theory offers novel unsupervised learning algorithms for "deep" architectures for image and speech recognition. We conjecture that the main computational goal of the ventral stream of visual cortex is to provide a hierarchical representation of new objects/images which is invariant to transformations, stable, and selective for recognition -- and show how this representation may be continuously learned in an unsupervised way during development and visual experience.
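
    The filtering-and-pooling module can be sketched in a few lines for the simplest group case, with 1-D circular shifts standing in for translation (names are ours). A shift of the input only permutes the set of template responses, so the pooled value is exactly invariant:

        import numpy as np

        def signature(x, template):
            # filter step: dot products with the template at every group
            # element (all circular shifts); pooling step: mean of |.|
            n = len(x)
            dots = [np.roll(template, g) @ x for g in range(n)]
            return np.mean(np.abs(dots))

        rng = np.random.default_rng(0)
        x = rng.standard_normal(16)
        t = rng.standard_normal(16)
        assert np.isclose(signature(x, t), signature(np.roll(x, 5), t))
        print("signature is exactly shift-invariant")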